Spatial Planning and Connectivity Corpus - Technical Background Report

IPBES Spatial Planning and Connectivity Assessment

Authors

Rainer M. Krug

Gabriella Bishop

Sebastian Villasante

Doi
Abstract

To Be added

DOI GitHub License: CC BY 4.0

Disclaimer

Contributors

Assessment Experts

  • xxx, yyy ORCID

Data and Knowledge TSU

  • Niamir, Aidin ORCID

Working Title

IPBES_SPC_Corpus

Code repo

GitHub repository

Build No: 68

Introduction

The literature search for the Spatial Planning and Connectivity assessment corpus was conducted using search terms provided by the experts and refined in co-operation with the IPBES task force for data and knowledge management. The search was conducted using OpenAlex, scripted from R to use the OpenAlex API. Search terms for the following searches were defined:

  • Spatial Planning and Connectivity
  • Nature / Environment
  • additional search terms for specific corpora

To assess the quality of the corpus, the experts selected sets of key papers, which were then checked for their presence in the corpus.

The following terminology is used in this document:

  • Corpus: A body of works based on a search on OpenAlex
  • Spatial Planning and Connectivity Assessment Corpus: Short: SPC corpus; The corpus resulting from the search terms TO BE ADDED
  • work: The term OpenAlex uses for a single document in its dataset. Each work has a unique OpenAlex id, but not necessarily a DOI.
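Because a work always carries an OpenAlex id but only sometimes a DOI, both identifiers usually need to be handled side by side. The following base R sketch shows one way to shorten the two identifiers and to flag works without a DOI. The data frame and its values are invented for illustration; only the URL-style form of the identifiers follows OpenAlex.

```r
# Invented example works; real ids and DOIs come from the OpenAlex API.
works <- data.frame(
  id  = c("https://openalex.org/W0000000001", "https://openalex.org/W0000000002"),
  doi = c("https://doi.org/10.0000/example.1", NA),
  stringsAsFactors = FALSE
)

# Short OpenAlex id: the trailing "W..." part of the URL.
works$short_id <- sub("^.*/(W[0-9]+)$", "\\1", works$id)

# Short DOI: the URL without the resolver prefix (NA stays NA).
works$short_doi <- sub("^https://doi\\.org/", "", works$doi)

# Flag works that can be looked up by DOI at all.
works$has_doi <- !is.na(works$doi) & nzchar(works$doi)

print(works[, c("short_id", "short_doi", "has_doi")])
```

Works flagged with has_doi = FALSE can then only be looked up by their OpenAlex id.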

The following searches are conducted on title and abstract only, to account for the fluctuating availability of full-text search and to keep the search more focussed.

Schematic Overview

Overview

flowchart TD
    Start([Start literature search]) --> SPC["spc_corpus.yaml<br/>Assemble base SPC corpus"]
    click SPC "./input/search_terms/spc_corpus.yaml" "Open spc_corpus.yaml"
    SPC --> SPC_list["spc keyword set<br/>(planning & connectivity terms)"]
    SPC --> NATURE_list["nature keyword set<br/>(environmental context terms)"]
    SPC_list --> BaseQuery["Level 1 query<br/>spc terms AND nature dictionary"]
    NATURE_list --> BaseQuery

    BaseQuery --> ChapterSelect{Apply chapter / theme refinements}

    ChapterSelect --> CH1["Chapter 1<br/>chapter_1.v1.yaml<br/>Governance & planning principles"]
    click CH1 "./input/search_terms/chapter_1.v1.yaml" "Open chapter_1.v1.yaml"
    ChapterSelect --> CH2["Chapter 2<br/>chapter_2.yaml + chapter_2_add.yaml + chapter_2_sdg.yaml<br/>GBF targets, nexus themes, SDGs"]
    click CH2 "./input/search_terms/chapter_2.yaml" "Open chapter_2.yaml"
    ChapterSelect --> CH3["Chapter 3<br/>chapter_3.yaml<br/>Restoration & conservation planning"]
    click CH3 "./input/search_terms/chapter_3.yaml" "Open chapter_3.yaml"
    ChapterSelect --> CH4["Chapter 4<br/>chapter_4.yaml<br/>Connectivity evidence & metrics"]
    click CH4 "./input/search_terms/chapter_4.yaml" "Open chapter_4.yaml"
    ChapterSelect --> CH5["Chapter 5<br/>chapter_5_* files<br/>Foresight & futures (sections, themes, cross-cutting)"]
    click CH5 "./input/search_terms/Chapter_5_1_2.yaml" "Open Chapter 5 search terms"
    ChapterSelect --> CH6["Chapter 6<br/>chapter_6.yaml (+ chapter_6_r2.yaml optional)<br/>Enabling environment"]
    click CH6 "./input/search_terms/chapter_6.yaml" "Open chapter_6.yaml"

Chapter 1

flowchart LR
    Start([SPC Corpus]) --> Ch1["chapter_1.v1.yaml<br/>Level 2 refinements"]
    click Ch1 "./input/search_terms/chapter_1.v1.yaml" "Open chapter_1.v1.yaml"

    subgraph Chapter1Sets["Chapter 1 thematic searches"]
        direction TB
        C1_1["Set 1:<br/>land/spatial planning<br/>+ biodiversity goals<br/>+ societal needs/values"]
        C1_2["Set 2:<br/>adaptive/scenario planning<br/>+ monitoring/feedback"]
        C1_3["Set 3:<br/>evidence & precaution<br/>+ ILK knowledge base"]
        C1_4["Set 4:<br/>multilevel/transparent governance<br/>+ customary coherence"]
        C1_5["Set 5:<br/>participatory planning<br/>+ co-design / engagement"]
        C1_6["Set 6:<br/>equity / rights / tenure<br/>+ justice outcomes"]
        C1_7["Set 7:<br/>connectivity (land-sea/cross-scale)<br/>+ nexus & climate links"]
    end

    Ch1 --> C1_1
    Ch1 --> C1_2
    Ch1 --> C1_3
    Ch1 --> C1_4
    Ch1 --> C1_5
    Ch1 --> C1_6
    Ch1 --> C1_7

Chapter 2

flowchart LR
    Start([SPC Corpus]) --> Ch2L2["chapter_2.yaml<br/>Level 2 GBF bundles"]
    click Ch2L2 "./input/search_terms/chapter_2.yaml" "Open chapter_2.yaml"

    subgraph L1_GBF["GBF contexts & targets"]
        direction TB
        GBF_Urban["GBF-1 Urban"]
        GBF_Rural["GBF-1 Rural"]
        GBF_Fresh["GBF-1 Freshwater"]
        GBF_Marine["GBF-1 Marine"]
        GBF_Restore["GBF-2 Ecosystem restoration"]
        T3["Target 3"]
        T4["Target 4"]
        T5["Target 5"]
        T6["Target 6"]
        T7["Target 7"]
        T8["Target 8"]
        T9["Target 9"]
        T10["Target 10"]
        T11["Target 11"]
        T12["Target 12"]
        T13["Target 13"]
        T14["Target 14"]
        T15["Target 15"]
        T16["Target 16"]
        T17["Target 17"]
        T18["Target 18"]
        T19["Target 19"]
        T20["Target 20"]
        T21["Target 21"]
        T22["Target 22"]
        T23["Target 23"]
        REL["Spatial Planning Related"]
    end
    Ch2L2 --> GBF_Urban
    Ch2L2 --> GBF_Rural
    Ch2L2 --> GBF_Fresh
    Ch2L2 --> GBF_Marine
    Ch2L2 --> GBF_Restore
    Ch2L2 --> T3
    Ch2L2 --> T4
    Ch2L2 --> T5
    Ch2L2 --> T6
    Ch2L2 --> T7
    Ch2L2 --> T8
    Ch2L2 --> T9
    Ch2L2 --> T10
    Ch2L2 --> T11
    Ch2L2 --> T12
    Ch2L2 --> T13
    Ch2L2 --> T14
    Ch2L2 --> T15
    Ch2L2 --> T16
    Ch2L2 --> T17
    Ch2L2 --> T18
    Ch2L2 --> T19
    Ch2L2 --> T20
    Ch2L2 --> T21
    Ch2L2 --> T22
    Ch2L2 --> T23
    Ch2L2 --> REL

    Start --> Ch2L3["chapter_2_add.yaml<br/>Level 3 nexus themes"]
    click Ch2L3 "./input/search_terms/chapter_2_add.yaml" "Open chapter_2_add.yaml"
    subgraph L1_NexusSets["Nexus add-ons"]
        direction TB
        Nexus_Water["Water"]
        Nexus_Food["Food"]
        Nexus_Health["Health"]
        Nexus_Climate["Climate"]
    end
    Ch2L3 --> Nexus_Water
    Ch2L3 --> Nexus_Food
    Ch2L3 --> Nexus_Health
    Ch2L3 --> Nexus_Climate

    L1_GBF --> Ch2L4["chapter_2_sdg.yaml<br/>Level 4 SDG filters"]
    L1_NexusSets --> Ch2L4
    click Ch2L4 "./input/search_terms/chapter_2_sdg.yaml" "Open chapter_2_sdg.yaml"
    subgraph SDGSets["SDG goal filters"]
        direction TB
        SDG1["SDG 1"]
        SDG2["SDG 2"]
        SDG3["SDG 3"]
        SDG4["SDG 4"]
        SDG5["SDG 5"]
        SDG6["SDG 6"]
        SDG7["SDG 7"]
        SDG8["SDG 8"]
        SDG9["SDG 9"]
        SDG10["SDG 10"]
        SDG11["SDG 11"]
        SDG12["SDG 12"]
        SDG13["SDG 13"]
        SDG14["SDG 14"]
        SDG15["SDG 15"]
        SDG16["SDG 16"]
        SDG17["SDG 17"]
    end
    Ch2L4 --> SDG1
    Ch2L4 --> SDG2
    Ch2L4 --> SDG3
    Ch2L4 --> SDG4
    Ch2L4 --> SDG5
    Ch2L4 --> SDG6
    Ch2L4 --> SDG7
    Ch2L4 --> SDG8
    Ch2L4 --> SDG9
    Ch2L4 --> SDG10
    Ch2L4 --> SDG11
    Ch2L4 --> SDG12
    Ch2L4 --> SDG13
    Ch2L4 --> SDG14
    Ch2L4 --> SDG15
    Ch2L4 --> SDG16
    Ch2L4 --> SDG17

Chapter 3

flowchart LR
    Start([SPC Corpus]) --> Ch3["chapter_3.yaml<br/>Level 2 refinements"]
    click Ch3 "./input/search_terms/chapter_3.yaml" "Open chapter_3.yaml"

    subgraph Chapter3Sets["Chapter 3 searches"]
        direction TB
        C3_1["Set 1:<br/>Protected/OECM + NBSAP + cases<br/>+ spatial prioritization + regional scales"]
        C3_2["Set 2:<br/>Restoration types + inclusivity & ILK"]
        C3_3["Set 3:<br/>Restoration planning + connectivity + resilience"]
        C3_4["Set 4:<br/>Systematic conservation planning / gap analysis"]
        C3_5["Set 5:<br/>Protected area & connectivity planning"]
        C3_6["Set 6:<br/>Landscape/species/corridor networks"]
        C3_7["Set 7:<br/>Conservation planning + ecosystem services"]
        C3_8["Set 8:<br/>Participatory conservation mapping"]
        C3_9["Set 9:<br/>Conservation effectiveness + spatial planning"]
        C3_10["Set 10:<br/>Adaptive management under global change drivers"]
    end
    Ch3 --> C3_1
    Ch3 --> C3_2
    Ch3 --> C3_3
    Ch3 --> C3_4
    Ch3 --> C3_5
    Ch3 --> C3_6
    Ch3 --> C3_7
    Ch3 --> C3_8
    Ch3 --> C3_9
    Ch3 --> C3_10

Chapter 4

flowchart LR
    Start([SPC Corpus]) --> Ch4["chapter_4.yaml<br/>Level 2 refinements"]
    click Ch4 "./input/search_terms/chapter_4.yaml" "Open chapter_4.yaml"

    subgraph Chapter4Sets["Chapter 4 searches"]
        direction TB
        C4_1["Set 1:<br/>Connectivity review articles"]
        C4_2["Set 2:<br/>Connectivity benefits vs risks<br/>+ ecosystem services"]
        C4_3["Set 3:<br/>Structural vs functional connectivity"]
        C4_4["Set 4:<br/>Connectivity modelling toolkits"]
        C4_5["Set 5:<br/>Connectivity indicators & metrics"]
        C4_6["Set 6:<br/>Policy & governance integration"]
        C4_7["Set 7:<br/>Multilevel / transboundary governance<br/>+ IPLC inclusion + land tenure"]
        C4_8["Set 8:<br/>Movement ecology<br/>(dispersal / migration / permeability)"]
    end
    Ch4 --> C4_1
    Ch4 --> C4_2
    Ch4 --> C4_3
    Ch4 --> C4_4
    Ch4 --> C4_5
    Ch4 --> C4_6
    Ch4 --> C4_7
    Ch4 --> C4_8

Chapter 5

flowchart LR
    Start([SPC Corpus]) --> Ch5L2["Chapter_5_1_2.yaml<br/>Level 2 foresight framing"]
    click Ch5L2 "./input/search_terms/Chapter_5_1_2.yaml" "Open Chapter_5_1_2.yaml"

    subgraph L2Sets["Section 5.1–5.2 searches"]
        direction TB
        C5_1["Set 1:<br/>Future + project*/predict*/scenario"]
        C5_2["Set 2:<br/>Future + pathway*/narrative*/vision*"]
        C5_3["Set 3:<br/>Future-proof*/anticipat*/scenario planning"]
        C5_4["Set 4:<br/>Foresight*/backcasting/simulat*/trend*"]
        C5_5["Set 5:<br/>Model* AND scenario"]
    end
    Ch5L2 --> C5_1
    Ch5L2 --> C5_2
    Ch5L2 --> C5_3
    Ch5L2 --> C5_4
    Ch5L2 --> C5_5

    Start --> Ch5L3["chapter_5_3.yaml<br/>Level 3 – Drivers of change"]
    click Ch5L3 "./input/search_terms/chapter_5_3.yaml" "Open chapter_5_3.yaml"
    subgraph L3Sets["Section 5.3 searches"]
        direction TB
        C5_3a["Set 1:<br/>Drivers of change"]
        C5_3b["Set 2:<br/>Driver modelling approaches"]
        C5_3c["Set 3:<br/>Driver gaps"]
    end
    Ch5L3 --> C5_3a
    Ch5L3 --> C5_3b
    Ch5L3 --> C5_3c

    Start --> Ch5L4["chapter_5_4.yaml<br/>Level 4 – Synergies & trade-offs"]
    click Ch5L4 "./input/search_terms/chapter_5_4.yaml" "Open chapter_5_4.yaml"
    subgraph L4Sets["Section 5.4 searches"]
        direction TB
        C5_4a["Set 1:<br/>Interactions (synergy/trade-off/nexus)"]
        C5_4b["Set 2:<br/>Response options (integrated planning / NbS)"]
        C5_4c["Set 3:<br/>Cross-scale synergy & trade-off terms"]
    end
    Ch5L4 --> C5_4a
    Ch5L4 --> C5_4b
    Ch5L4 --> C5_4c

    Start --> Ch5L5["chapter_5_5.yaml<br/>Level 5 – Uncertainty & risk"]
    click Ch5L5 "./input/search_terms/chapter_5_5.yaml" "Open chapter_5_5.yaml"
    subgraph L5Sets["Section 5.5 searches"]
        direction TB
        C5_5a["Set 1:<br/>Adaptive / transformative management"]
        C5_5b["Set 2:<br/>Uncertainty quantification"]
        C5_5c["Set 3:<br/>Tipping points & thresholds"]
        C5_5d["Set 4:<br/>Cascading risks & precaution"]
    end
    Ch5L5 --> C5_5a
    Ch5L5 --> C5_5b
    Ch5L5 --> C5_5c
    Ch5L5 --> C5_5d

    Start --> Ch5L6["chapter_5_6.yaml<br/>Level 6 – Knowledge to action"]
    click Ch5L6 "./input/search_terms/chapter_5_6.yaml" "Open chapter_5_6.yaml"
    subgraph L6Sets["Section 5.6 searches"]
        direction TB
        C5_6a["Set 1:<br/>Science-policy-practice pathways"]
        C5_6b["Set 2:<br/>ILK integration & community planning"]
        C5_6c["Set 3:<br/>Enabling factors & coordination"]
        C5_6d["Set 4:<br/>Shocks, surprises, uncertainties"]
    end
    Ch5L6 --> C5_6a
    Ch5L6 --> C5_6b
    Ch5L6 --> C5_6c
    Ch5L6 --> C5_6d

    Start --> Ch5CC["chapter_5_cc.yaml<br/>Cross-cutting themes"]
    click Ch5CC "./input/search_terms/chapter_5_cc.yaml" "Open chapter_5_cc.yaml"
    subgraph CCSets["Cross-cutting searches"]
        direction TB
        CC1["Set 1:<br/>Scales & telecoupling"]
        CC2["Set 2:<br/>Co-production & inclusion"]
    end
    Ch5CC --> CC1
    Ch5CC --> CC2

Chapter 6

flowchart LR
    Start([SPC Corpus]) --> Ch6["chapter_6.yaml<br/>Level 2 refinements"]
    click Ch6 "./input/search_terms/chapter_6.yaml" "Open chapter_6.yaml"

    subgraph Chapter6Sets["Chapter 6 searches"]
        direction TB
        C6_1["Set 1:<br/>Institutional & governance structures"]
        C6_2["Set 2:<br/>Political & strategic leadership"]
        C6_3["Set 3:<br/>Socio-cultural & stakeholder engagement"]
        C6_4["Set 4:<br/>Collaboration, trust & networks"]
        C6_5["Set 5:<br/>Financial & economic mechanisms"]
        C6_6["Set 6:<br/>Legal & policy frameworks"]
        C6_7["Set 7:<br/>Human & institutional capacity"]
        C6_8["Set 8:<br/>Data, knowledge & decision support"]
        C6_9["Set 9:<br/>Ecological & spatial planning tools"]
        C6_10["Set 10:<br/>Cross-cutting process enablers"]
    end
    Ch6 --> C6_1
    Ch6 --> C6_2
    Ch6 --> C6_3
    Ch6 --> C6_4
    Ch6 --> C6_5
    Ch6 --> C6_6
    Ch6 --> C6_7
    Ch6 --> C6_8
    Ch6 --> C6_9
    Ch6 --> C6_10

    Ch6R2["chapter_6_r2.yaml<br/>Optional Level 3 filter"]
    Chapter6Sets --> Ch6R2
    click Ch6R2 "./input/search_terms/chapter_6_r2.yaml" "Open chapter_6_r2.yaml"
    subgraph Chapter6Case["Chapter 6 searches"]
        R2["Case-study keywords<br/>(case stud*, example*, initiative*, etc.)"]
    end
    Ch6R2 --> R2

Type Selection

OpenAlex contains more than 270 million works of different types. The following table lists and explains the available types and shows which are included in the SPC Corpus.

params$types |>
  knitr::kable(
    caption = "OpenAlex Work Types and Inclusion in the SPC Corpus",
    booktabs = TRUE,
    align = c("l", "l", "l", "c")
  ) # |>
# kableExtra::kable_styling(
#     full_width = FALSE,
#     position = "left"
# )

OpenAlex Work Types and Inclusion in the SPC Corpus

Type | Description | Included
article | Scholarly journal articles and related periodical works | TRUE
book | Monographs and other long-form published books | TRUE
book-chapter | Chapters published within edited books or proceedings | TRUE
dissertation | Doctoral or master level theses and dissertations | TRUE
editorial | Editorials and editor introductions | TRUE
preprint | Pre-publication manuscripts shared prior to peer review | TRUE
report | Technical, institutional, or policy reports | TRUE
review | Narrative or systematic review articles | TRUE
dataset | Published datasets and structured data releases | FALSE
erratum | Published corrections to previously released works | FALSE
grant | Summaries or descriptions of grant-funded projects | FALSE
letter | Correspondence, commentaries, and short letters | FALSE
libguides | Library research guides and curated bibliographies | FALSE
other | Works that do not align with a more specific OpenAlex type | FALSE
paratext | Prefaces, introductions, indexes, and other paratextual items | FALSE
peer-review | Formal peer review reports and evaluations | FALSE
reference-entry | Encyclopaedia or dictionary reference entries | FALSE
retraction | Notices retracting previously published works | FALSE
standard | Standards, protocols, and technical specifications | FALSE
supplementary-materials | Supplementary files accompanying primary publications | FALSE
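The selection of types can be turned into a filter vector with a single subsetting step. The sketch below uses a toy excerpt of the type table; the object names are hypothetical, and whether params$types_filter is derived this way in the report is an assumption.

```r
# Toy excerpt of the type table; column names mirror the rendered table.
types <- data.frame(
  Type     = c("article", "book", "dataset", "erratum", "review"),
  Included = c(TRUE, TRUE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Keep only the names of the types flagged for inclusion.
types_filter <- types$Type[types$Included]

print(types_filter)
```

Keeping the inclusion flag in the table and deriving the filter from it means that the documented selection and the applied selection cannot drift apart.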

Methods

Assessment of Individual Terms in the spc and nature Search Terms

This assessment is run against the whole OpenAlex corpus; it is filtered by type but not by date range.

fn <- file.path(params$output_dir, "searchterm_assessment_spcc.rds")

# Run the assessment only once; later builds reuse the cached file.
if (!file.exists(fn)) {
  result <- list(
    # Each spc term is assessed in combination with the complete nature set.
    spc = assess_search_term_both(
      st = params$search_terms$spc_corpus$spc,
      and_term = st(params$search_terms$spc_corpus$nature),
      types = params$types_filter,
      verbose = TRUE
    ),
    # Each nature term is assessed in combination with the complete spc set.
    nature = assess_search_term_both(
      st = params$search_terms$spc_corpus$nature,
      and_term = st(params$search_terms$spc_corpus$spc),
      types = params$types_filter,
      verbose = FALSE
    )
  )
  saveRDS(result, fn)
}

Get Key Papers

Here we retrieve the key papers into a parquet database which is partitioned by:

  • found_in: the search term, or openalex, used as the filter, i.e. the key paper occurs in the corpus which would result from that search term
  • id_used: the id used for the lookup, either the OpenAlex id (oa_id) or the DOI (doi)
  • page: only for processing reasons

No filtering is applied, neither by type nor by publication year.

The other columns are as returned by the OpenAlex API.
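The partition columns described above are encoded in the directory names as key=value pairs (Hive-style partitioning), which is how arrow::open_dataset() recovers them as columns. The base R sketch below builds and parses such a path; both helper functions are hypothetical and only illustrate the naming scheme, with the page partition left out for brevity.

```r
# Build the directory of one partition from its key=value pairs.
partition_dir <- function(root, found_in, id_used) {
  file.path(root, paste0("found_in=", found_in), paste0("id_used=", id_used))
}

# Recover the partition values from such a path as a named character vector.
parse_partition <- function(path) {
  parts <- strsplit(path, "/", fixed = TRUE)[[1]]
  kv <- parts[grepl("=", parts, fixed = TRUE)]
  values <- sub("^[^=]*=", "", kv)
  names(values) <- sub("=.*$", "", kv)
  values
}

p <- partition_dir("parquet", "spc", "doi")
print(p)
print(parse_partition(p))
```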

st <- list(
  spc = params$search_terms$spc_corpus$spc |>
    paste0(collapse = " "),
  nature = params$search_terms$spc_corpus$nature |>
    paste0(collapse = " ")
)
st$spcc <- paste0("(", st$spc, ") AND (", st$nature, ")")

dois <- params$key_papers$goldstandard$DOI[
  params$key_papers$goldstandard$DOI != ""
]
ids <- params$key_papers$goldstandard$openalex_id[
  params$key_papers$goldstandard$openalex_id != ""
]

fn <- file.path(params$keyworks, "parquet", "found_in=openalex")
if (!dir.exists(fn)) {
  ### KP in OpenAlex
  openalexPro::pro_query(
    doi = dois,
    chunk_limit = 50
  ) |>
    openalexPro::pro_request(
      output = file.path(fn, "..", "json_doi")
    ) |>
    openalexPro::pro_request_jsonl(
      output = file.path(fn, "..", "jsonl_doi"),
      delete_input = TRUE
    ) |>
    openalexPro::pro_request_jsonl_parquet(
      output = file.path(fn, "id_used=doi"),
      delete_input = TRUE
    )

  openalexPro::pro_query(
    id = ids,
    multiple_id = TRUE,
    chunk_limit = 50
  ) |>
    openalexPro::pro_request(
      output = file.path(fn, "..", "json_oa_id")
    ) |>
    openalexPro::pro_request_jsonl(
      output = file.path(fn, "..", "jsonl_oa_id"),
      delete_input = TRUE
    ) |>
    openalexPro::pro_request_jsonl_parquet(
      output = file.path(fn, "id_used=oa_id"),
      delete_input = TRUE
    )
}

fn <- file.path(params$keyworks, "parquet", "found_in=spc")
if (!dir.exists(fn)) {
  openalexPro::pro_query(
    title_and_abstract.search = st$spc,
    doi = dois,
    chunk_limit = 25
  ) |>
    openalexPro::pro_request(
      output = file.path(fn, "..", "json_doi")
    ) |>
    openalexPro::pro_request_jsonl(
      output = file.path(fn, "..", "jsonl_doi"),
      delete_input = TRUE
    ) |>
    openalexPro::pro_request_jsonl_parquet(
      output = file.path(fn, "id_used=doi"),
      delete_input = TRUE
    )

  openalexPro::pro_query(
    title_and_abstract.search = st$spc,
    id = ids,
    multiple_id = TRUE,
    chunk_limit = 25
  ) |>
    openalexPro::pro_request(output = file.path(fn, "json_oa_id")) |>
    openalexPro::pro_request_jsonl(
      output = file.path(fn, "jsonl_oa_id"),
      delete_input = TRUE
    ) |>
    openalexPro::pro_request_jsonl_parquet(
      output = file.path(fn, "id_used=oa_id"),
      delete_input = TRUE
    )
}

fn <- file.path(params$keyworks, "parquet", "found_in=nature")
if (!dir.exists(fn)) {
  ### KP in nature
  openalexPro::pro_query(
    title_and_abstract.search = st$nature,
    doi = dois,
    chunk_limit = 25
  ) |>
    openalexPro::pro_request(output = file.path(fn, "json_doi")) |>
    openalexPro::pro_request_jsonl(
      output = file.path(fn, "jsonl_doi"),
      delete_input = TRUE
    ) |>
    openalexPro::pro_request_jsonl_parquet(
      output = file.path(fn, "id_used=doi"),
      delete_input = TRUE
    )

  openalexPro::pro_query(
    title_and_abstract.search = st$nature,
    id = ids,
    multiple_id = TRUE,
    chunk_limit = 25
  ) |>
    openalexPro::pro_request(output = file.path(fn, "json_oa_id")) |>
    openalexPro::pro_request_jsonl(
      output = file.path(fn, "jsonl_oa_id"),
      delete_input = TRUE
    ) |>
    openalexPro::pro_request_jsonl_parquet(
      output = file.path(fn, "id_used=oa_id"),
      delete_input = TRUE
    )
}

fn <- file.path(params$keyworks, "parquet", "found_in=spcc")
if (!dir.exists(fn)) {
  ### KP in spcc
  openalexPro::pro_query(
    title_and_abstract.search = st$spcc,
    doi = dois,
    chunk_limit = 15
  ) |>
    openalexPro::pro_request(output = file.path(fn, "json_doi")) |>
    openalexPro::pro_request_jsonl(
      output = file.path(fn, "jsonl_doi"),
      delete_input = TRUE
    ) |>
    openalexPro::pro_request_jsonl_parquet(
      output = file.path(fn, "id_used=doi"),
      delete_input = TRUE
    )

  openalexPro::pro_query(
    title_and_abstract.search = st$spcc,
    id = ids,
    multiple_id = TRUE,
    chunk_limit = 25
  ) |>
    openalexPro::pro_request(output = file.path(fn, "json_oa_id")) |>
    openalexPro::pro_request_jsonl(
      output = file.path(fn, "jsonl_oa_id"),
      delete_input = TRUE
    ) |>
    openalexPro::pro_request_jsonl_parquet(
      output = file.path(fn, "id_used=oa_id"),
      delete_input = TRUE
    )
}

Key Papers in Search Terms

The works retrieved in the previous step are analysed here to produce a table showing in which corpora each key paper occurs.

fn <- file.path(params$keyworks, "kp_found_in.rds")
if (!file.exists(fn)) {
  arrow::open_dataset(file.path(params$keyworks, "parquet")) |>
    dplyr::select(
      id,
      doi,
      type,
      found_in,
      title,
      citation
    ) |>
    dplyr::group_by(id, doi, title, citation, type) |>
    dplyr::summarise(
      in_openalex = base::max(found_in == "openalex", na.rm = TRUE),
      in_spc = base::max(found_in == "spc", na.rm = TRUE),
      in_nature = base::max(found_in == "nature", na.rm = TRUE),
      in_spcc = base::max(found_in == "spcc", na.rm = TRUE),
      .groups = "drop"
    ) |>
    dplyr::collect() |>
    saveRDS(fn)
}
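The aggregation in the chunk above collapses one row per key paper and corpus into one row per key paper with a logical flag per corpus. The same idea can be shown on a toy table in base R; the ids and values are invented, and the cross-tabulation is a stand-in for the arrow and dplyr pipeline, not a copy of it.

```r
# Toy long table: one row per key paper and corpus in which it was found.
kp_long <- data.frame(
  id       = c("W1", "W1", "W1", "W2", "W3", "W3"),
  found_in = c("openalex", "spc", "spcc", "openalex", "openalex", "nature"),
  stringsAsFactors = FALSE
)

corpora <- c("openalex", "spc", "nature", "spcc")

# Cross-tabulate id by corpus; a count above zero means "found".
tab <- table(kp_long$id, factor(kp_long$found_in, levels = corpora))

# One logical column per corpus, named like the columns in the report.
kp_wide <- data.frame(id = rownames(tab), stringsAsFactors = FALSE)
for (corpus in corpora) {
  kp_wide[[paste0("in_", corpus)]] <- unname(tab[, corpus] > 0)
}

print(kp_wide)
```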

Get Hit Counts for the Search Terms from OpenAlex

These counts are retrieved directly from OpenAlex without downloading any works. They are used to assess the quality of the SPC Corpus.

The query contains:

  • the search term (nature, spc, spcc)
  • the types selected (article, book, book-chapter, dissertation, editorial, preprint, report, review)
  • the date range (from 1992-01-01 to 2025-12-31)
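How these three parts might combine into a single request can be sketched with the OpenAlex filter syntax, in which clauses are separated by commas (logical AND) and alternatives within a clause by "|" (logical OR). The search term and variable names below are placeholders, and the request that openalexPro::pro_query() actually builds may differ in detail.

```r
# Placeholder inputs mirroring the three parts of the query.
search_term <- "(connectivity) AND (wetland)"
types       <- c("article", "book", "review")
date_from   <- "1992-01-01"
date_to     <- "2025-12-31"

# One clause per attribute, joined by commas.
filter <- paste(
  paste0("title_and_abstract.search:", search_term),
  paste0("type:", paste(types, collapse = "|")),
  paste0("from_publication_date:", date_from),
  paste0("to_publication_date:", date_to),
  sep = ","
)

# Requesting one result per page keeps the response small; the hit count
# is reported in the "meta" part of the response.
url <- paste0(
  "https://api.openalex.org/works?filter=",
  utils::URLencode(filter, reserved = FALSE),
  "&per-page=1"
)

print(filter)
print(url)
```

A search term that itself contains a comma would need extra care, since the comma separates filter clauses.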

The following counts are retrieved:

Overall hits

fn <- file.path(params$corpus, "st_hits.rds")
if (!file.exists(fn)) {
  st <- list(
    spc = params$search_terms$spc_corpus$spc |>
      paste0(collapse = " "),
    nature = params$search_terms$spc_corpus$nature |>
      paste0(collapse = " ")
  )
  st$spcc <- paste0("(", st$spc, ") AND (", st$nature, ")")

  queries <- lapply(
    st,
    function(s) {
      openalexPro::pro_query(
        title_and_abstract.search = s,
        type = params$types_filter,
        from_publication_date = params$publication_date$from,
        to_publication_date = params$publication_date$to
      )
    }
  )
  queries$openalex <- openalexPro::pro_query(
    type = params$types_filter,
    from_publication_date = params$publication_date$from,
    to_publication_date = params$publication_date$to
  )

  pbapply::pblapply(
    queries,
    function(query) {
      query |>
        openalexR::oa_request(
          count_only = TRUE,
          verbose = TRUE
        ) |>
        unlist()
    }
  ) |>
    do.call(what = cbind) |>
    t() |>
    as.data.frame() |>
    dplyr::select(count) |>
    saveRDS(file = fn)
}

Counts per Language

fn <- file.path(params$corpus, "st_languages.rds")
if (!file.exists(fn)) {
  st <- list(
    spc = params$search_terms$spc_corpus$spc |>
      paste0(collapse = " "),
    nature = params$search_terms$spc_corpus$nature |>
      paste0(collapse = " ")
  )
  st$spcc <- paste0("(", st$spc, ") AND (", st$nature, ")")

  queries <- lapply(
    st,
    function(s) {
      openalexPro::pro_query(
        title_and_abstract.search = s,
        type = params$types_filter,
        from_publication_date = params$publication_date$from,
        to_publication_date = params$publication_date$to,
        group_by = "language"
      )
    }
  )
  queries$openalex <- openalexPro::pro_query(
    type = params$types_filter,
    from_publication_date = params$publication_date$from,
    to_publication_date = params$publication_date$to,
    group_by = "language"
  )

  pbapply::pblapply(
    queries,
    function(query) {
      query |>
        openalexR::oa_request(
          verbose = TRUE
        ) |>
        dplyr::bind_rows()
    }
  ) |>
    dplyr::bind_rows(.id = "source") |>
    dplyr::select(source, language = key_display_name, count) |>
    tidyr::pivot_wider(
      names_from = source,
      values_from = count,
      names_prefix = "count_",
      values_fill = 0
    ) |>
    dplyr::select(
      language,
      count_openalex,
      count_spc,
      count_nature,
      count_spcc
    ) |>
    dplyr::arrange(language) |>
    saveRDS(file = fn)
}

Counts per Publication Year

fn <- file.path(params$corpus, "st_years.rds")
if (!file.exists(fn)) {
  st <- list(
    spc = params$search_terms$spc_corpus$spc |>
      paste0(collapse = " "),
    nature = params$search_terms$spc_corpus$nature |>
      paste0(collapse = " ")
  )
  st$spcc <- paste0("(", st$spc, ") AND (", st$nature, ")")

  queries <- lapply(
    st,
    function(s) {
      openalexPro::pro_query(
        title_and_abstract.search = s,
        type = params$types_filter,
        from_publication_date = params$publication_date$from,
        to_publication_date = params$publication_date$to,
        group_by = "publication_year"
      )
    }
  )
  queries$openalex <- openalexPro::pro_query(
    type = params$types_filter,
    from_publication_date = params$publication_date$from,
    to_publication_date = params$publication_date$to,
    group_by = "publication_year"
  )

  pbapply::pblapply(
    queries,
    function(query) {
      query |>
        openalexR::oa_request(
          verbose = TRUE
        ) |>
        dplyr::bind_rows()
    }
  ) |>
    dplyr::bind_rows(.id = "source") |>
    dplyr::select(source, year = key, count) |>
    dplyr::mutate(year = base::as.integer(year)) |>
    tidyr::pivot_wider(
      names_from = source,
      values_from = count,
      names_prefix = "count_",
      values_fill = 0
    ) |>
    dplyr::select(
      year,
      count_openalex,
      count_spc,
      count_nature,
      count_spcc
    ) |>
    dplyr::arrange(dplyr::desc(year)) |>
    saveRDS(file = fn)
}

Results

Assessment of Search Terms

Each individual term is assessed in combination with the complete other term set as an AND term, e.g. each individual term in spc is assessed with AND nature.
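The per-term queries can be pictured as follows. This is a minimal sketch with an invented excerpt of the two term sets; it assumes that the terms within a set are alternatives joined by OR, and it does not reproduce the internals of assess_search_term_both().

```r
# Invented excerpt of the two term sets.
spc_terms    <- c("connectivity", "zoning", "rewilding")
nature_terms <- c("wetland", "forest")

# The complete nature set, OR-joined, serves as the AND term.
nature_block <- paste0("(", paste(nature_terms, collapse = " OR "), ")")

# One query per individual spc term.
per_term <- data.frame(
  term  = spc_terms,
  query = paste0("(", spc_terms, ") AND ", nature_block),
  stringsAsFactors = FALSE
)

print(per_term)
```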

SPC Term

readRDS(file.path(params$output_dir, "searchterm_assessment_spcc.rds"))$spc |>
  dplyr::arrange(dplyr::desc(count)) |>
  dplyr::mutate(
    count = format(count, big.mark = ","),
    count_excl = format(count_excl, big.mark = ","),
  ) |>
  knitr::kable(format = "html", escape = FALSE)
term | count | count_excl
(planning AND ("for ecosystem services" OR process OR scenario OR tools)) | 1,016,647 | 627,639
(scenario AND (analysis OR "based model" OR "based planning" OR building OR planning OR thinking OR "and models" OR "of land use")) | 822,519 | 572,782
(spatial AND (composition OR configuration OR "conservation prioritisation" OR "conservation prioritization" OR decision OR development OR "forest planning" OR governance OR planning OR prioritisation OR prioritization OR transformation)) | 602,413 | 436,320
((anticipatory OR "community based" OR "forest management" OR "land-sea" OR participatory OR "place based" OR regional OR sectorial OR territorial OR urban OR "agricultural management") AND planning) | 569,869 | 288,303
connectivity | 461,758 | 421,047
restoration | 461,290 | 414,027
(land AND (allocation OR governance OR system)) | 307,178 | 196,950
(adaptive AND (management OR planning)) | 300,667 | 197,005
(conservation AND (practice OR planning OR program OR strategy)) | 224,105 | 134,896
(("key biodiversity" OR protected OR "remote ocean") AND areas) | 221,704 | 164,455
(landscape AND (complexity OR planning OR governance OR management OR "ecological planning")) | 188,061 | 90,419
("land use" AND (decision OR planning OR governance OR management OR model OR scenario OR trade-offs OR tradeoffs)) | 175,717 | 61,628
((integrative OR "trends and") AND scenarios) | 157,786 | 44,471
(animal AND (migration OR movement)) | 69,677 | 59,417
((functional OR working) AND landscapes) | 57,262 | 33,433
(("cumulative impact" OR "environmental impact" OR "strategic environmental") AND assessment) | 53,022 | 39,824
(ecological AND (corridor OR networks)) | 48,448 | 28,849
zoning | 47,470 | 34,068
"systems integration" | 45,355 | 39,505
(biodiversity AND (assessment OR indicators OR scenario)) | 42,857 | 18,865
((blue OR green) AND infrastructure) | 34,200 | 21,547
("ecosystem service" AND (mapping OR model OR planning)) | 23,141 | 5,925
((inclusive OR marine) AND governance) | 22,645 | 14,496
"stepping stones" | 15,421 | 13,466
(nature AND ("based solutions" OR futures)) | 15,025 | 10,555
(integrated AND ("assessment framework" OR "assessment model" OR "coastal zone management")) | 9,028 | 5,531
((habitat OR wildlife) AND corridor) | 7,986 | 2,479
(("ecosystem based" OR "sea use") AND management) | 5,700 | 2,989
"multi-criteria decision analysis" | 5,568 | 4,062
rewilding | 2,048 | 1,414
(("generalized dissimilarity" OR macroecological) AND model) | 1,051 | 657
IPBES | 1,011 | 577
"reserve design" | 640 | 315
"futures thinking" | 554 | 388
OECM | 325 | 201
"critical areas for biodiversity" | 17 | 7

Nature Term

readRDS(file.path(params$output_dir, "searchterm_assessment_spcc.rds"))$nature |>
  dplyr::arrange(dplyr::desc(count)) |>
  dplyr::mutate(
    count = format(count, big.mark = ","),
    count_excl = format(count_excl, big.mark = ","),
  ) |>
  knitr::kable(format = "html", escape = FALSE)
term | count | count_excl
environment | 5,628,345 | 4,169,826
species | 3,615,729 | 2,254,923
nature | 3,236,060 | 2,485,363
sustainable | 2,901,442 | 2,070,778
soil | 1,900,934 | 1,250,549
sea | 1,251,679 | 753,939
forest | 1,056,247 | 528,538
river | 1,014,129 | 576,820
ecological | 894,562 | 307,329
landscape | 886,301 | 516,869
conservation | 827,750 | 371,991
Earth | 816,853 | 536,257
ecosystem | 757,569 | 204,087
marine | 732,912 | 345,603
ocean | 638,250 | 315,888
habitat | 526,700 | 129,189
lake | 502,996 | 292,091
restoration | 461,290 | 310,633
mountain | 436,824 | 226,862
coast | 405,029 | 186,587
biodiversity | 274,104 | 49,065
terrestrial | 250,166 | 89,718
flora | 203,586 | 106,716
planet | 198,843 | 107,449
freshwater | 197,657 | 68,682
fauna | 177,738 | 54,737
maritime | 165,508 | 100,528
plantation | 156,225 | 74,660
desert | 148,279 | 80,520
wetland | 142,734 | 55,441
wildlife | 140,384 | 44,400
grassland | 115,594 | 31,832
estuary | 91,764 | 30,842
bog | 63,942 | 47,301
marshes | 62,247 | 27,301
"protected areas" | 58,787 | 10,640
"natural resource" | 57,705 | 23,479
"agricultural land" | 52,808 | 19,771
meadow | 50,125 | 15,737
woodland | 48,886 | 13,751
biosphere | 42,260 | 11,825
"coastal waters" | 32,463 | 10,831
savanna | 28,058 | 8,946
"coupled system" | 24,957 | 20,583
dryland | 24,491 | 9,880
peatland | 20,474 | 8,506
"arable land" | 18,026 | 4,736
mires | 17,488 | 11,466
tundra | 17,263 | 4,695
fjord | 13,064 | 5,973
shrubland | 9,550 | 1,171
bioeconomy | 8,433 | 4,195
"resource system" | 4,983 | 2,905
seascape | 4,297 | 1,132
heathland | 3,323 | 833
marshland | 3,312 | 898
chaparral | 2,662 | 1,168
"environmental resource" | 2,071 | 735
agroforest | 1,498 | 322
"agro-forest" | 478 | 0

Key Papers in Corpus

No filtering is applied, neither by type nor by publication year, so only the pure search terms are evaluated. A paper listed in this table is therefore not necessarily included in the final SPC Corpus, which is additionally filtered by date and type.
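Whether a listed key paper would also survive the type and date filters can be checked with a simple logical test. The sketch below uses invented works together with the included types and the date range stated for the counts; it illustrates the rule and is not code from the report.

```r
# Invented works with their OpenAlex type and publication date.
works <- data.frame(
  id               = c("W1", "W2", "W3"),
  type             = c("article", "dataset", "article"),
  publication_date = as.Date(c("1998-05-01", "2010-03-15", "1985-11-30")),
  stringsAsFactors = FALSE
)

included_types <- c(
  "article", "book", "book-chapter", "dissertation",
  "editorial", "preprint", "report", "review"
)
date_from <- as.Date("1992-01-01")
date_to   <- as.Date("2025-12-31")

# A work stays in the final corpus only if it passes both filters.
works$in_final <- works$type %in% included_types &
  works$publication_date >= date_from &
  works$publication_date <= date_to

print(works[, c("id", "in_final")])
```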

readRDS(file.path(params$keyworks, "kp_found_in.rds")) |>
  dplyr::mutate(
    id_display = sub("^.*/(W[0-9]+)$", "\\1", id),
    id = sprintf("<a href=\"%s\" target=\"_blank\">%s</a>", id, id_display),
    doi_display = sub("^https://doi\\.org/", "", doi),
    doi = sprintf("<a href=\"%s\" target=\"_blank\">%s</a>", doi, doi_display)
  ) |>
  dplyr::arrange(
    in_spcc,
    in_spc,
    in_nature,
    in_openalex
  ) |>
  dplyr::mutate(
    dplyr::across(
      dplyr::starts_with("in_"),
      ~ dplyr::case_when(
        .x ~ '<b style="color:#008000;">☑</b>', # green bold checkbox
        !.x ~ '<b style="color:#cc0000;">☐</b>' # red bold empty checkbox
      )
    )
  ) |>
  dplyr::select(
    id,
    doi,
    citation,
    in_spcc,
    in_spc,
    in_nature,
    in_openalex
  ) |>
  knitr::kable(format = "html", escape = FALSE)
id doi citation in_spcc in_spc in_nature in_openalex
W2128217744 10.1177/0306312713508669 David H. Guston (2013)
W1973628253 10.1016/j.enpol.2005.12.006 Will McDowall & Malcolm Eames (2006)
W3170399214 10.1016/j.mex.2021.101401 Daniel Beiderbeck et al. (2021)
W2964311550 10.1016/j.techfore.2019.07.002 Ian Belton et al. (2019)
W2015646994 10.1016/j.tree.2014.07.005 Carly N. Cook et al. (2014)
W4238224782 10.1016/j.futures.2015.08.007 I. Milojević & Sohail Inayatullah (2015)
W2086673960 10.1016/j.tree.2009.04.008 William J. Sutherland & Harry J. Woodroof (2009)
W4233598570 10.1146/annurev-anthro-102218-011435 David Valentine & Amelia Hassoun (2019)
W2056591496 10.1016/s0016-3287(98)00101-3 Richard A. Slaughter (1998)
W2090312322 10.1016/j.futures.2007.11.010 Joseph Voros (2007)
W1992974326 10.1016/j.futures.2007.11.011 Richard A. Slaughter (2007)
W2795786882 10.1007/s00267-018-1028-3 Ida N.S. Djenontin & Alison M. Meadow (2018)
W2904898541 10.1007/s10668-018-00300-5 Azime Tezer et al. (2018)
W1990241575 10.1007/s10021-004-0074-2 Paul Raskin (2005)
W4414957974 10.1073/pnas.2501695122 Damaris Zurell et al. (2025)
W4298615974 10.1007/s10980-022-01534-5 Tom Harwood et al. (2022)
W4210765186 10.1038/s41893-021-00844-x Roslyn Henry et al. (2022)
W2910481941 10.1080/02513625.2018.1562795 Peter Schmitt & Thorsten Wiechmann (2018)
W2116090915 10.1016/j.futures.2012.10.003 Muhammad Amer et al. (2012)
W2057938345 10.1108/14636680810855991 Sohail Inayatullah (2008)
W4386609980 10.1038/s43588-023-00503-5 Yu Zheng et al. (2023)
W4407294089 10.1080/02697459.2025.2459975 Romina Rodela (2025)
W2265414043 10.1146/annurev.ecolsys.32.081501.114012 Steward T. A. Pickett et al. (2001)
W2918948909 10.1016/j.gecco.2019.e00569 Arieanna C. Balbar & Anna Metaxas (2019)
W2099188808 10.1016/j.landurbplan.2006.04.005 Jolande W. Termorshuizen et al. (2006)
W4297536966 10.1016/j.tree.2022.09.002 Maria Beger et al. (2022)
W2027491594 10.1016/j.ecolind.2015.03.029 Christian Albert et al. (2015)
W2766942546 10.1080/08920753.2017.1373450 Kekuewa Kikiloi et al. (2017)
W3194313759 10.1007/978-94-024-1681-7 Christina von Haaren et al. (2019)
W1966247992 10.1080/21513732.2011.617711 Davide Geneletti (2011)
W3100416804 10.1111/1365-2664.13796 Virgilio Hermoso et al. (2020)
W4409833817 10.1126/science.adn2225 Jedediah F. Brodie et al. (2025)
W2884329716 10.1007/978-3-319-94021-2 László Miklós & Anna Špinerová (2018)
W2524916285 10.1002/aqc.2645 Alan M. Friedlander et al. (2016)
W3021127236 10.1016/j.marpol.2020.103950 Thomas Robertson et al. (2020)
W4413900892 10.1016/j.tree.2025.07.014 Jian Peng et al. (2025)
W4412750811 10.1016/j.marpol.2025.106852 Jean‐Eudes Beuret et al. (2025)
W4410629764 10.1016/j.rsma.2025.104257 Liisi Lees et al. (2025)
W4281717750 10.1126/science.abl8974 Angela Brennan et al. (2022)
W1963746476 10.1016/j.ecocom.2009.10.006 R.S. de Groot et al. (2009)
W4211243502 10.1038/35012251 Chris Margules & Robert L. Pressey (2000)
W2759207970 10.3390/su9091668 Leena Karrasch et al. (2017)
W3159060995 10.1016/j.ecoser.2021.101273 Karsten Grunewald et al. (2021)
W2487200415 10.1016/j.marpol.2016.06.023 Elianny Domínguez-Tejo et al. (2016)
W2956763155 10.1088/1748-9326/ab3234 Annika T. H. Keeley et al. (2019)
W3200717663 10.1007/s10980-021-01329-0 Jianquan Dong et al. (2021)
W4313593374 10.1016/j.mex.2022.101989 Holly Kirk et al. (2023)
W4387055605 10.1038/s44183-023-00022-w Julie Reimer et al. (2023)
W3215595594 10.1093/biosci/biab091 Abigail J. Lynch et al. (2021)
W4393866290 10.3390/su16072965 Qiqi Hu et al. (2024)
W2554309037 10.1111/btp.12386 Agnieszka E. Latawiec et al. (2016)
W3209482460 10.25607/obp-1666 Alejandro Iglesias-Campos et al. (2021)
W4320016094 10.1007/978-3-031-15773-8_4 Falko Buschke et al. (2023)
W4295308933 10.1073/pnas.2203385119 Natalia Estrada-Carmona et al. (2022)
W4412257196 10.5281/zenodo.6522392 Unai Pascual et al. (2022)
W4415164578 10.1098/rsos.250810 Rachael Garrett et al. (2025)
W2114666631 10.1525/bio.2010.60.3.7 Timothy J. Beechie et al. (2010)
W2612157793 10.1016/j.marpol.2017.06.020 Mara Ntona & Elisa Morgera (2017)
W2035832288 10.1007/s10980-014-0085-0 Christian Albert et al. (2014)
W2604803374 10.1093/biosci/bix012 Charles H. Nilon et al. (2017)
W4220936177 10.1016/j.oneear.2022.02.008 Christopher M. Raymond et al. (2022)
W2999493939 10.1016/j.landurbplan.2019.103741 Christian Albert et al. (2020)
W2766457534 10.1016/j.landusepol.2017.10.017 Chiara Cortinovis & Davide Geneletti (2017)
W4376627395 10.1016/j.marpol.2023.105655 Julie Reimer et al. (2023)
W4211108983 10.1007/978-3-030-20024-4 Davide Geneletti et al. (2019)
W2804302736 10.1016/j.scitotenv.2018.05.147 Maria da Luz Fernandes et al. (2018)
W3134065015 10.1016/j.envsci.2021.02.001 Davide Longato et al. (2021)
W4392293717 10.1016/j.ecolind.2024.111816 Wen Song et al. (2024)
W2046569818 10.1007/s10980-014-0052-9 Christine Fürst et al. (2014)
W2810876831 10.3897/rio.4.e28045 Evelyn Underwood et al. (2018)
W2902345284 10.1007/s10980-018-0745-6 Marcin Spyra et al. (2018)
W3158973852 10.1016/j.landurbplan.2021.104129 Chiara Cortinovis et al. (2021)
W2771003204 10.1080/21513732.2017.1396257 Christine Fürst et al. (2017)
W2592317409 10.1080/21513732.2017.1296494 Justice Nana Inkoom et al. (2017)
W4406477943 10.1007/s11252-024-01656-5 Israa H. Mahmoud et al. (2025)
W4289516997 10.1007/s41207-022-00315-5 Georgia Pozoukidou et al. (2022)
W4404646473 10.1007/s00267-024-02086-x Nina Farwig et al. (2024)
W2161139387 10.1126/science.1196624 Henrique M. Pereira et al. (2010)
W2023339029 10.1046/j.1523-1739.2003.01491.x Garry Peterson et al. (2003)
W2971398159 10.1111/rec.13035 George D. Gann et al. (2019)
W2062482344 10.1073/pnas.1201040109 Joshua Goldstein et al. (2012)
W1540682446 10.1111/brv.12008 Aija S. Kukkala & Atte Moilanen (2012)
W2028797766 10.1016/j.cosust.2013.05.002 Ralf Seppelt et al. (2013)
W2470861343 10.1016/j.landurbplan.2016.05.003 Adrienne Grêt‐Regamey et al. (2016)
W2754686867 10.1038/s41559-017-0273-9 Isabel M.D. Rosa et al. (2017)
W3185562566 10.1038/d41586-021-02041-4 Georgina G. Gurney et al. (2021)
W4395447577 10.1126/science.adn3441 Henrique M. Pereira et al. (2024)
W4212886831 10.1111/geb.13459 Karel Mokany et al. (2022)
W4406904698 10.1016/j.tree.2024.12.002 Sylvaine Giakoumi et al. (2025)
W4406487214 10.1007/s10980-024-02042-4 Jiangxiao Qiu et al. (2025)
W2015619301 10.1073/pnas.1000530107 Lian Pin Koh & Jaboury Ghazoul (2010)
W3126018057 10.1111/rec.13346 Jordi Cortina et al. (2021)
W2156111479 10.1111/gcb.12383 Sebastián Martinuzzi et al. (2013)
W2902463689 10.1016/j.tree.2018.10.006 Emily Nicholson et al. (2018)
W2952896657 10.5194/gmd-11-4537-2018 Hyejin Kim et al. (2018)
W3161819570 10.1126/science.abc4896 Louise O’Connor et al. (2021)
W4382181311 10.1007/s11625-023-01316-1 América Paz Durán et al. (2023)
W2769947232 10.1016/j.cosust.2017.10.004 Jean Paul Metzger et al. (2017)
W3151200781 10.1111/rec.13403 Ben L. Gilby et al. (2021)
W56780107 10.1007/978-1-4612-0529-6_10 Jack Ahern (1999)
W2159760863 10.1111/j.1466-8238.2010.00620.x Joachim H. Spangenberg et al. (2011)
W4382751812 10.1016/j.biocon.2023.110068 Marcel Kok et al. (2023)
W3044450114 10.1016/j.envsoft.2020.104806 Andrew J. Hoskins et al. (2020)
W3074014079 10.1007/s10113-020-01685-8 Clara J. Veerkamp et al. (2020)
W3145276257 10.1016/j.jenvman.2021.112400 Yuyoung Choi et al. (2021)
W2614376759 10.1016/j.envsci.2017.05.003 Vanessa M. Adams et al. (2017)
W4386734502 10.1146/annurev-environ-112321-095011 Steven J. Cork et al. (2023)
W2073677603 10.1016/j.biocon.2015.02.015 Sebastián Martinuzzi et al. (2015)
W2972648221 10.1111/1365-2664.13506 Karel Mokany et al. (2019)
W4411717980 10.1007/s11625-025-01682-y Sana Okayasu et al. (2025)
W4310004272 10.1007/s11625-022-01251-7 Lucas Rutting et al. (2022)
W1990273831 10.1126/science.1242552 Silke Bauer & Bethany J. Hoye (2014)
W2784805535 10.1126/science.aam9712 Marlee A. Tucker et al. (2018)
W3085078608 10.1038/s41467-020-18457-x Michelle Ward et al. (2020)
W1880093763 10.1111/ecog.01507 Rafael A. Magris et al. (2015)
W2783683111 10.1016/j.biocon.2017.12.020 Santiago Saura et al. (2018)
W2791599583 10.1111/conl.12439 Rafael A. Magris et al. (2018)
W2965201645 10.1016/j.biocon.2019.07.028 Santiago Saura et al. (2019)
W3132073286 10.1016/j.biocon.2021.109008 Annika T. H. Keeley et al. (2021)
W4390696637 10.1038/s41467-023-43832-9 Rachel Neugarten et al. (2024)
W4302293680 10.1111/csp2.12823 David M. Theobald et al. (2022)
W4402144060 10.1002/ece3.70231 Amanda Liczner et al. (2024)
W4412704230 10.1073/pnas.2410937122 Robin Naidoo et al. (2025)
W4280541891 10.3389/fevo.2022.830822 Sylvia Wood et al. (2022)
W4392367143 10.1111/brv.13066 Steven J. Cooke et al. (2024)
W4410131928 10.3354/meps14888 Susanne E. Tanner et al. (2025)
W4413449508 10.1038/s41467-025-63205-8 Jedediah F. Brodie et al. (2025)
W4414925393 10.1016/j.tree.2025.09.007 Sandra Neubert et al. (2025)
W2255223904 NA Bob Scholes (2010)
W2978599153 NA Peter H. Verburg et al. (2019)
W4415240687 10.1007/s10980-025-02210-0 Tamsin L. Woodman et al. (2025)
W3085006993 10.1002/pan3.10146 Laura Pereira et al. (2020)
W2093445010 10.1126/science.1258832 Jianguo Liu et al. (2015)
W4380362703 10.1016/j.gloenvcha.2023.102681 Hyejin Kim et al. (2023)
W4210386268 10.1016/j.envsci.2022.01.013 Andressa V. Mansur et al. (2022)
W2122104680 10.1111/j.1523-1739.2009.01212.x William J. Sutherland et al. (2009)
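
The share of key papers recovered by each search can be summarised from the logical `in_*` columns. The following is a minimal sketch, assuming a table shaped like `kp_found_in.rds`; the toy rows are invented purely for illustration and are not results of this report.

```r
# Toy stand-in for kp_found_in.rds (hypothetical values, illustration only)
kp <- data.frame(
  id          = c("W1", "W2", "W3", "W4"),
  in_spcc     = c(TRUE, FALSE, FALSE, TRUE),
  in_spc      = c(TRUE, TRUE, FALSE, TRUE),
  in_nature   = c(TRUE, FALSE, TRUE, TRUE),
  in_openalex = c(TRUE, TRUE, TRUE, TRUE)
)

# Columns flagging in which search a key paper was found
in_cols <- names(kp)[startsWith(names(kp), "in_")]

# The mean of a logical vector is the proportion of TRUE values,
# i.e. the share of key papers recovered by each search
recall <- vapply(kp[in_cols], mean, numeric(1), na.rm = TRUE)
recall
```

A low share for `in_spcc` combined with a high share for `in_spc` or `in_nature` would suggest that the combination of the two term sets, rather than either set alone, is what excludes key papers.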

SPC Corpus Measures and Numbers

These data are gathered directly from OpenAlex without downloading any works. They are used to assess the quality of the SPC Corpus.

The query contains:

  • the search term (nature, spc, spcc)
  • the types selected (article, book, book-chapter, dissertation, editorial, preprint, report, review)
  • the date range (from 1992-01-01 to 2025-12-31)
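
The three query components could be combined into a single OpenAlex filter along the following lines. This is a sketch only: the filter names (`title_and_abstract.search`, `type`, `from_publication_date`, `to_publication_date`) and the `group_by` parameter are assumed from the public OpenAlex API syntax, the search string is a hypothetical placeholder, and the report's own scripts may build the request differently.

```r
# Work types and date range as listed above
types <- c(
  "article", "book", "book-chapter", "dissertation",
  "editorial", "preprint", "report", "review"
)
from_date <- "1992-01-01"
to_date <- "2025-12-31"

# Hypothetical placeholder; the real search terms are defined elsewhere
search_term <- "\"spatial planning\" AND biodiversity"

# Filter components are comma-separated (AND); alternatives within
# one component are pipe-separated (OR)
filter <- paste(
  paste0("title_and_abstract.search:", search_term),
  paste0("type:", paste(types, collapse = "|")),
  paste0("from_publication_date:", from_date),
  paste0("to_publication_date:", to_date),
  sep = ","
)

# Requesting grouped counts returns numbers only, no work records
url <- paste0(
  "https://api.openalex.org/works?filter=",
  utils::URLencode(filter, reserved = FALSE),
  "&group_by=publication_year"
)
url
```

The sketch only assembles the request URL and does not contact the API.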

Overall counts

readRDS(file.path(params$corpus, "st_hits.rds")) |>
  dplyr::mutate(
    count = format(count, big.mark = ",")
  ) |>
  knitr::kable(format = "html", escape = FALSE)
count
spc 4,540,713
nature 20,174,128
spcc 2,220,221
openalex 206,359,476
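
To put these counts in proportion, each search can be expressed as a share of all OpenAlex works matching the same type and date filters. A minimal sketch using the counts from the table above (the vector is typed in by hand here, not read from `st_hits.rds`):

```r
# Counts copied from the "Overall counts" table
hits <- c(
  spc      = 4540713,
  nature   = 20174128,
  spcc     = 2220221,
  openalex = 206359476
)

# Share of each search relative to all OpenAlex works, in percent
share_of_openalex <- hits[c("spc", "nature", "spcc")] / hits[["openalex"]]
round(100 * share_of_openalex, 2)

# Fraction of spc hits that remain in spcc
spcc_within_spc <- hits[["spcc"]] / hits[["spc"]]
round(spcc_within_spc, 3)
```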

Publication Years

readRDS(file.path(params$corpus, "st_years.rds")) |>
  dplyr::mutate(
    dplyr::across(
      dplyr::starts_with("count_"),
      ~ base::format(.x, big.mark = ",")
    )
  ) |>
  knitr::kable(format = "html", escape = FALSE)
year count_openalex count_spc count_nature count_spcc
2025 6,329,647 269,512 963,820 153,565
2024 8,670,872 344,987 1,265,778 184,319
2023 8,738,795 336,209 1,234,064 167,936
2022 7,788,332 286,117 1,094,753 140,819
2021 8,727,338 289,326 1,116,499 139,635
2020 9,568,427 275,312 1,089,285 131,475
2019 9,309,954 234,536 980,923 110,162
2018 9,088,145 213,301 910,018 100,204
2017 9,036,442 194,774 846,609 90,810
2016 9,223,797 188,189 824,278 88,225
2015 8,995,848 184,968 815,518 87,337
2014 8,903,032 180,964 819,518 86,340
2013 8,629,314 171,217 790,683 81,682
2012 8,166,331 156,660 741,051 75,505
2011 7,898,106 146,247 695,938 70,195
2010 7,364,775 132,282 651,249 63,836
2009 6,892,547 116,420 586,378 55,990
2008 6,377,760 104,273 526,909 49,988
2007 5,927,238 91,643 481,199 44,084
2006 5,588,856 85,041 447,241 40,650
2005 5,119,892 75,496 404,987 36,302
2004 4,686,841 65,745 363,301 32,031
2003 4,356,515 59,870 335,471 28,914
2002 4,165,034 58,792 327,544 27,727
2001 3,574,270 42,437 260,635 20,958
2000 3,406,355 38,876 247,224 19,021
1999 2,957,468 32,716 207,587 15,798
1998 2,819,905 29,627 195,158 14,253
1997 2,690,018 27,100 183,103 12,911
1996 2,561,448 25,907 177,272 12,022
1995 2,385,363 23,621 163,935 11,050
1994 2,248,063 21,393 150,096 9,806
1993 2,132,163 19,446 143,412 8,709
1992 2,030,586 17,709 132,692 7,962
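
Besides the absolute counts, the table allows the share of `spcc` works among all OpenAlex works to be computed per year. A minimal sketch using three rows copied from the table above; the choice of years is arbitrary and only illustrates the calculation:

```r
# Three rows copied from the publication year table
years <- data.frame(
  year           = c(1992, 2008, 2024),
  count_openalex = c(2030586, 6377760, 8670872),
  count_spcc     = c(7962, 49988, 184319)
)

# Share of spcc works among all OpenAlex works in the same year
years$share_spcc <- years$count_spcc / years$count_openalex

# Display as percentages
data.frame(
  year = years$year,
  share_spcc_percent = round(100 * years$share_spcc, 2)
)
```

This per-year ratio differs from the sum-scaled values in the graph, which divide each yearly count by the total of its own source across all years.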

This graph shows only the relative number of publications per year (each source scaled to sum to 1) so that differing trends between the sources can be identified.

readRDS(file.path(params$corpus, "st_years.rds")) |>
  tidyr::pivot_longer(
    cols = dplyr::starts_with("count_"),
    names_to = "source",
    values_to = "count",
    names_prefix = "count_"
  ) |>
  # scale each source so its total sum is 1
  dplyr::group_by(source) |>
  dplyr::mutate(
    total_count = base::sum(count, na.rm = TRUE),
    count = dplyr::if_else(total_count > 0, count / total_count, 0)
  ) |>
  dplyr::ungroup() |>
  dplyr::select(-total_count) |>
  ggplot2::ggplot(ggplot2::aes(x = year, y = count, color = source)) +
  ggplot2::geom_line(linewidth = 1) +
  ggplot2::geom_point(size = 1.5) +
  ggplot2::scale_y_continuous(
    labels = scales::label_percent(accuracy = 1),
    expand = ggplot2::expansion(mult = c(0.02, 0.06))
  ) +
  ggplot2::labs(
    x = "Year",
    y = "Share of total works (sum-scaled per source)",
    color = "Source",
    title = "Publications per year by source (each source sums to 1)"
  ) +
  ggplot2::theme_minimal(base_size = 14) +
  ggplot2::theme(
    legend.position = "bottom",
    panel.grid.minor = ggplot2::element_blank()
  )

Language

readRDS(file.path(params$corpus, "st_languages.rds")) |>
  dplyr::mutate(
    dplyr::across(
      dplyr::starts_with("count_"),
      ~ base::format(.x, big.mark = ",")
    )
  ) |>
  knitr::kable(format = "html", escape = FALSE)
language count_openalex count_spc count_nature count_spcc
Afrikaans 152,064 186 1,881 85
Albanian 13,463 5 159 3
Arabic 629,690 739 4,002 316
Bengali 2,831 5 59 4
Bulgarian 52,160 119 805 61
Catalan 548,816 1,374 9,242 658
Chinese 4,514,153 620 9,036 275
Croatian 327,752 571 5,455 234
Czech 411,672 148 1,491 86
Danish 272,371 215 4,474 54
Dutch 760,453 718 6,090 223
English 147,226,131 4,412,876 19,339,969 2,170,343
Estonian 107,444 111 1,518 60
Finnish 172,580 175 1,183 59
French 5,683,321 19,313 194,313 7,893
German 5,126,898 5,530 25,172 1,709
Gujarati 433 0 3 0
Hebrew 3,853 24 82 13
Hindi 10,137 2 27 0
Hungarian 132,392 546 2,705 263
Indonesian 3,065,335 30,700 130,878 10,606
Italian 1,485,985 2,461 15,210 1,275
Japanese 6,356,671 1,253 10,635 448
Kannada 384 0 0 0
Korean 3,773,377 1,502 12,173 555
Latvian 25,364 13 126 8
Lithuanian 95,950 362 1,198 189
Macedonian 27,117 48 215 14
Malayalam 308 0 14 0
Marathi 2,225 1 8 1
Modern Greek (1453-) 139,882 185 1,215 71
Nepali (macrolanguage) 3,418 6 59 3
Norwegian 252,659 141 2,123 35
Panjabi 118 0 1 0
Persian 493,319 192 2,037 75
Polish 1,119,116 604 5,555 251
Portuguese 3,992,720 7,622 52,248 4,052
Romanian 259,537 252 3,855 115
Russian 2,267,290 11,177 50,456 5,514
Slovak 51,243 38 281 18
Slovenian 104,086 189 1,297 98
Somali 19,578 6 77 4
Spanish 7,150,614 33,028 208,043 11,534
Swahili (macrolanguage) 24,088 5 145 1
Swedish 431,008 426 2,087 86
Tagalog 90,423 98 1,093 13
Tamil 3,050 4 108 1
Telugu 64 0 2 0
Thai 95,822 2,155 4,062 708
Turkish 868,717 986 7,453 516
Ukrainian 508,196 1,287 8,113 585
Urdu 2,608 0 4 0
Vietnamese 164,519 159 1,535 88
Welsh 42,954 8 152 2
NA 26,847 2 30 0

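This graph shows only the relative number of publications per language for the 15 most frequent languages (each source scaled to sum to 1) so that differences between the sources can be identified.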

readRDS(file.path(params$corpus, "st_languages.rds")) |>
  # keep top 15 languages by total (before scaling)
  dplyr::mutate(
    total = count_openalex + count_spc + count_nature + count_spcc
  ) |>
  dplyr::slice_max(total, n = 15) |>
  dplyr::arrange(dplyr::desc(total)) |>
  # fix display order so largest stays on top
  dplyr::mutate(language = factor(language, levels = rev(language))) |>
  dplyr::select(-total) |>
  # reshape wide → long
  tidyr::pivot_longer(
    cols = dplyr::starts_with("count_"),
    names_to = "source",
    values_to = "count",
    names_prefix = "count_"
  ) |>
  # scale so each source sums to 1
  dplyr::group_by(source) |>
  dplyr::mutate(
    total_source = base::sum(count, na.rm = TRUE),
    count = dplyr::if_else(total_source > 0, count / total_source, 0)
  ) |>
  dplyr::ungroup() |>
  dplyr::select(-total_source) |>
  ggplot2::ggplot(ggplot2::aes(x = language, y = count, fill = source)) +
  ggplot2::geom_col(position = "dodge") +
  ggplot2::coord_flip() +
  ggplot2::scale_y_continuous(
    labels = scales::label_percent(accuracy = 1),
    expand = ggplot2::expansion(mult = c(0, 0.05))
  ) +
  ggplot2::labs(
    x = "Language",
    y = "Share of total works within each source",
    fill = "Source",
    title = "Publications by language (top 15), scaled so each source sums to 1"
  ) +
  ggplot2::theme_minimal(base_size = 14) +
  ggplot2::theme(
    legend.position = "bottom",
    panel.grid.minor = ggplot2::element_blank()
  )

Reuse

Citation

BibTeX citation:
@report{krug,
  author = {Krug, Rainer M. and Bishop, Gabriella and Villasante,
    Sebastian},
  title = {Spatial {Planning} and {Connectivity} {Corpus} - {Technical}
    {Background} {Report}},
  doi = {10.5281/zenodo.XXXXX},
  langid = {en},
  abstract = {To Be added}
}
For attribution, please cite this work as:
Krug, Rainer M., Gabriella Bishop, and Sebastian Villasante. n.d. “Spatial Planning and Connectivity Corpus - Technical Background Report.” IPBES Spatial Planning and Connectivity Assessment. https://doi.org/10.5281/zenodo.XXXXX.